Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets

Authors

  • Juan Antonio Pérez-Ortiz
  • Felix A. Gers
  • Douglas Eck
  • Jürgen Schmidhuber
Abstract

The long short-term memory (LSTM) network trained by gradient descent solves difficult problems that traditional recurrent neural networks generally cannot. We have recently observed that the decoupled extended Kalman filter training algorithm allows for even better performance, significantly reducing the number of training steps compared to the original gradient descent training algorithm. In this paper we present a set of experiments that are unsolvable by classical recurrent networks but are solved elegantly, robustly, and quickly by LSTM combined with Kalman filters.

Similar resources

Learning Context Sensitive Languages with LSTM Trained with Kalman Filters

Unlike traditional recurrent neural networks, the Long Short-Term Memory (LSTM) model generalizes well when presented with training sequences derived from regular and also simple nonregular languages. Our novel combination of LSTM and the decoupled extended Kalman filter, however, learns even faster and generalizes even better, requiring only the 10 shortest exemplars (n ≤ 10) of the context sen...

Improving Long-Term Online Prediction with Decoupled Extended Kalman Filters

Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) outperform traditional RNNs when dealing with sequences involving not only short-term but also long-term dependencies. The decoupled extended Kalman filter (DEKF) learning algorithm works well in online environments and significantly reduces the number of training steps compared to the standard gradient-descent algorithms. Prev...

Complex Extended Kalman Filters for Training Recurrent Neural Network Channel Equalizers

The Kalman filter is named after Rudolf E. Kalman, who in 1960 published his famous paper (Kalman, 1960) describing a recursive solution to the discrete-data linear filtering problem. There are several tutorial papers and books dealing with the subject for a great variety of applications in many areas, from engineering to finance (Grewal & Andrews, 2001; Sorenson, 1970; Haykin, 2001; Bar-Shalom & L...

Convolutional LSTM Networks for Subcellular Localization of Proteins

Machine learning is widely used to analyze biological sequence data. Non-sequential models such as SVMs or feed-forward neural networks are often used, although they have no natural way of handling sequences of varying length. Recurrent neural networks such as the long short-term memory (LSTM) model, on the other hand, are designed to handle sequences. In this study we demonstrate that LSTM networ...

The Optimization of Forecasting ATMs Cash Demand of Iran Banking Network Using LSTM Deep Recursive Neural Network

One of the problems of the banking system is forecasting cash demand for ATMs (Automated Teller Machines). Correct prediction can improve the profitability of the banking system and satisfy its customers. Accuracy in this prediction is the main goal of this research. If an ATM faces a shortage of cash, it will face the decline of bank...

Journal:
  • Neural Networks: The Official Journal of the International Neural Network Society

Volume 16, Issue 2

Pages  -

Publication date: 2003